Note: This page's design, presentation and content have been created and enhanced using Claude (Anthropic's AI assistant) to improve visual quality and educational experience.
Week 4 • Sub-Lesson 3

🔍 Transparency, Authorship & Integrity

Disclosure norms, the AI authorship debate, bias, privacy, and the academic integrity spectrum

What We'll Cover

This session moves from ethical theory to the practical questions every researcher using AI tools must navigate: When and how should you disclose AI use? Can AI be listed as an author? How does bias manifest in AI-assisted research? What happens to your data when you use cloud-based AI? Where does legitimate tool use end and academic misconduct begin?

These are questions with evolving answers. Journal policies, institutional guidelines, and legal frameworks are all in flux. The goal is not to give you definitive rules — it is to equip you to make informed, defensible decisions as the landscape shifts.

📝 Transparency: When and How to Disclose AI Use

Disclosure is the foundation of trust in AI-assisted research. But what exactly should you disclose, and where?

The Case for Disclosure

  • Reproducibility: Other researchers need to know what tools were used to evaluate and replicate your work. "AI-assisted" covers a vast range of practices
  • Trust: Readers assess credibility differently when they know AI was involved. Undisclosed AI use, if later discovered, damages trust far more than transparent use
  • Accountability: If AI-generated content contains errors, biases, or fabrications, disclosure clarifies who is responsible (the researcher who chose to use and present the output)
  • Norm-setting: Current disclosure practices establish the expectations that will govern AI use in your field for years to come

What and How to Disclose

What to include:

  • Which AI tool(s) were used (name, version if known, date of use)
  • What they were used for (ideation, drafting, editing, analysis, coding, literature search)
  • The degree of human modification of AI outputs
  • Any prompts used (increasingly expected in some fields)

Where to disclose:

  • Methods section (for substantive AI use in research process)
  • Acknowledgements (for editorial assistance)
  • Author contribution statements (where applicable)
  • Supplementary materials (for detailed prompts or interaction logs)
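
For substantive use, a brief statement in the methods or acknowledgements usually suffices. An illustrative template (not the required wording of any journal; adapt it to your target venue's guidelines): "During the preparation of this work, the authors used [tool name, version, date of use] to [purpose]. All AI-generated content was reviewed, verified, and edited by the authors, who take full responsibility for the content of the publication."

Where detailed interaction logs are expected in supplementary materials, keeping a structured log from the start is far easier than reconstructing one at submission time. A minimal sketch in Python, in which the file name and record fields are illustrative rather than any standard:

```python
import datetime
import json

LOG_PATH = "ai_interaction_log.jsonl"  # illustrative file name

def log_interaction(tool, version, purpose, prompt, response):
    """Append one AI interaction to a JSON Lines log for supplementary materials."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tool": tool,          # e.g. "ChatGPT"
        "version": version,    # model or version string, if known
        "purpose": purpose,    # e.g. "editing", "literature search"
        "prompt": prompt,
        "response": response,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```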

⚠️ A Moving Target

Disclosure norms are evolving rapidly. What is considered adequate disclosure today may be considered insufficient in a year. Always check your target journal's most recent author guidelines before submission — policies are updated frequently, and cached or secondary sources may be outdated.

📄 Key Reading

Mollick, E. (2023): "On AI and the Ethics of Disclosure" — A thoughtful, practical perspective on when and how to disclose AI use, from one of the most widely read writers on AI in professional contexts.

📋 Journal Policies: The Current Landscape

Major publishers have issued AI use policies, but they differ in specifics. The table below captures the current state — check for updates before relying on it.

📊 AI Policies of Major Publishers (as of early 2026)

  • Nature / Springer Nature: AI authorship: no. Disclosure: yes, in Methods. AI tools cannot be credited as authors. Use must be documented in the methods or acknowledgements. Authors take full responsibility for AI-assisted content.
  • Science / AAAS: AI authorship: no. Disclosure: yes, in Methods. Text generated by AI must be disclosed. AI cannot fulfil the role of author. Figures and data generated by AI must be clearly identified.
  • IEEE: AI authorship: no. Disclosure: yes. Detailed guidance on disclosure. AI-generated text must be clearly attributed. Authors must verify all AI-assisted content for accuracy.
  • ACM: AI authorship: no. Disclosure: yes. 2024 policy with specific requirements. Authors must describe the nature and extent of AI use. AI cannot meet authorship criteria.
  • Elsevier: AI authorship: no. Disclosure: yes. AI and AI-assisted technologies cannot be listed as authors. Use must be disclosed. Authors are responsible for the accuracy of AI-generated content.
  • PLOS: AI authorship: no. Disclosure: yes. Requires specific language about AI use in the methods section. Authors must take responsibility for AI-assisted work.

Note: All major publishers have converged on two points: (1) AI cannot be listed as an author, and (2) AI use must be disclosed. The specifics of how to disclose vary. Always check the most recent version of author guidelines.

📄 Key Readings

Nature Editorial (2023): "Tools Such as ChatGPT Threaten Transparent Science; Here Are Our Ground Rules" — The influential early statement that shaped many subsequent policies.

ACM (2024): "Policy on Authorship and Use of Generative AI and Large Language Models" — A detailed, discipline-specific policy worth examining closely as a model.

📝 Activity: Journal Policy Audit

Survey the AI disclosure policies of the top 5 journals in your field. For each journal: find the current AI/LLM policy (or note its absence), document what is required for disclosure, note the authorship rules, and identify anything ambiguous or missing. This exercise will be discussed in class — bring your findings.

✍️ Can AI Be an Author?

The authorship question has generated intense debate. The current consensus is clear — but the reasoning behind it is worth examining.

The Case Against AI Authorship

The dominant position, reflected in all major publisher policies:

  • Accountability: Authorship implies legal and ethical responsibility for the work. AI cannot be held accountable for errors, fabrications, or misconduct
  • ICMJE criteria: The widely adopted criteria require (1) substantial contribution, (2) drafting or critical revision, (3) final approval, and (4) accountability for accuracy and integrity. AI fails criterion 4 definitively
  • Credit: Authorship confers academic credit — citations, h-index, career advancement. AI does not benefit from or need these
  • Precedent: Listing AI as an author normalises the idea that non-human entities can bear scholarly responsibility — a precedent with uncertain consequences

The Complications

Even if the current consensus is correct, the reasoning is worth interrogating:

  • Substantial contribution: In some workflows, AI makes genuinely substantial intellectual contributions — structuring arguments, identifying patterns, generating hypotheses. The "tool" framing may understate this
  • Ghostwriting precedent: Human ghostwriting — where the named author did not write the text — is accepted in some contexts (political speeches, some medical writing). How is AI-generated text categorically different?
  • Contribution statements: Rather than authorship, detailed contribution statements could honestly record what AI did and what humans did — a more transparent approach than either listing AI as author or pretending it played no role
  • Evolving capabilities: As AI systems become more capable, the assumptions behind current policies may need revisiting

💡 Where Things Stand

The current consensus (2025–2026) is clear: AI cannot be listed as an author on academic publications. Every major publisher has taken this position. But the reasoning behind this consensus — centred on accountability and the nature of authorship itself — reveals assumptions that may need updating as AI capabilities evolve and as our understanding of human-AI collaboration deepens.

For now, the practical advice is straightforward: do not list AI as an author. Use disclosure statements and contribution descriptions to be transparent about AI's role.

⚖️ Bias in AI-Assisted Research

"Bias in, bias out" — but in research contexts, the manifestations can be subtle and the consequences significant.

Sources of Bias

AI systems trained on large datasets inherit and can amplify the biases present in that data:

  • Language bias: Training data overwhelmingly in English, underrepresenting non-English scholarship and perspectives
  • Geographic bias: Data disproportionately from the Global North — US, Europe, and East Asia
  • Temporal bias: Training data has a cutoff date; recent developments may be absent or underrepresented
  • Demographic bias: Underrepresentation of minority groups, indigenous communities, and marginalised populations
  • Publication bias: Models trained on published work inherit the biases of what gets published — positive results, established methodologies, dominant theoretical frameworks

How Bias Manifests in Research

  • Literature review: AI may systematically overlook non-English sources, regional journals, or scholarship from underrepresented communities
  • Coding and analysis: AI may apply analytical categories developed in one context (typically Western, English-speaking) to data from another
  • Writing: AI tends to adopt dominant discourse patterns, potentially flattening disciplinary and cultural diversity in how research is communicated
  • Recommendations: AI may reinforce mainstream approaches over novel, interdisciplinary, or locally grounded methodologies
  • Data interpretation: AI may impose patterns or categories that do not fit the data, producing confident but misleading analyses

Mitigation Strategies

  • Awareness: Know what biases are likely in your tools — this requires understanding how they were trained (which is often only partially disclosed)
  • Cross-checking: Use multiple AI tools and compare results with manual methods. Where they diverge, investigate why (a minimal sketch follows this list)
  • Diverse prompting: Explicitly ask for non-dominant perspectives, non-English sources, alternative frameworks
  • Domain expertise: Human judgment remains essential for detecting bias — AI cannot reliably identify its own blind spots
  • Transparency: Report which AI tools were used so that readers and reviewers can assess bias risk in your specific context
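
As an illustration of the cross-checking strategy, the sketch below sends the same literature-scoping prompt to two independently trained models and prints both outputs for side-by-side human review. It assumes the official openai and anthropic Python SDKs with API keys set in the environment; the prompt and model names are examples that will date quickly:

```python
# pip install openai anthropic
from openai import OpenAI
import anthropic

PROMPT = (
    "List key empirical studies on remittances and rural livelihoods, "
    "including non-English and Global South scholarship where possible."
)

# Same prompt, two independently trained models.
openai_text = OpenAI().chat.completions.create(
    model="gpt-4o",  # example model name
    messages=[{"role": "user", "content": PROMPT}],
).choices[0].message.content

anthropic_text = anthropic.Anthropic().messages.create(
    model="claude-sonnet-4-20250514",  # example model name
    max_tokens=1024,
    messages=[{"role": "user", "content": PROMPT}],
).content[0].text

# The point is human inspection: find where the outputs diverge, then
# verify those items against library databases rather than trusting
# either model.
for name, text in [("OpenAI", openai_text), ("Anthropic", anthropic_text)]:
    print(f"=== {name} ===\n{text}\n")
```

Divergence does not tell you which output is right; it tells you where to direct manual verification.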

⚠️ The Invisible Risk

AI-generated bias is not always visible. AI can produce confident, well-structured, fluent output that contains systematic omissions or distortions. The risk is particularly high in qualitative research, where AI may impose interpretive categories that do not fit the data — and the researcher may not notice because the output reads so convincingly. Treat AI outputs with the same critical scrutiny you would apply to any research assistant's work — and more, because the assistant's biases are structural rather than idiosyncratic.

🔒 Privacy: What Happens to Your Data?

When you paste text into a cloud-based AI tool, where does it go — and who else might see it?

Data Handling by AI Providers

Policies vary across providers and tiers, and change frequently. Key considerations:

  • Training use: Some consumer-tier services may use conversation data to improve their models (unless the user opts out). Enterprise and API tiers typically have stronger protections
  • Data retention: Providers retain conversation data for varying periods — from 30 days to indefinitely, depending on the service and tier
  • No guaranteed deletion: Even when providers offer deletion, it may not apply to data already incorporated into model training
  • Jurisdictional issues: Your data may be processed and stored in a different country, subject to different privacy laws

Implications for Researchers

The privacy question is not hypothetical — it has real consequences for research practice:

  • Participant data: Pasting interview transcripts, survey responses, or medical records into consumer AI tools may violate your ethics approval and data protection commitments
  • Unpublished findings: Entering draft manuscripts or preliminary results into AI tools exposes them to potential data breaches before publication
  • Proprietary data: Industry-funded research may involve confidentiality agreements that prohibit sharing data with third-party AI services
  • IRB/ethics implications: Your ethics committee likely did not anticipate AI tool use when they approved your protocol — you may need to seek amended approval

💡 Practical Guidance

For sensitive research data:

  • Use enterprise tiers with explicit data protection guarantees
  • Anonymise data before using AI tools (a minimal sketch follows below)
  • Check your university's data handling policy and your ethics approval conditions
  • Consider running local/open-weight models for sensitive analysis (e.g., LLaMA, Mistral on your own hardware)
  • When in doubt, ask your ethics committee before proceeding
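
As a concrete illustration of the anonymisation step, the sketch below masks obvious direct identifiers before a transcript leaves your machine. It is deliberately minimal: regex redaction catches emails and phone numbers but not names, places, or contextual identifiers, so treat it as a starting point under your ethics approval, not a de-identification protocol:

```python
import re

# Patterns for obvious direct identifiers. Real de-identification of
# interview data needs far more than this (names, places, roles, dates).
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace each matched identifier with its placeholder token."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

transcript = "You can reach me at jane.doe@example.org or +44 7700 900123."
print(redact(transcript))  # You can reach me at [EMAIL] or [PHONE].
```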

🎓 The Academic Integrity Spectrum

AI use in research is not binary — legitimate or illegitimate. It exists on a spectrum, with a large grey zone that requires judgment.

The Spectrum of AI Use in Research

  • Generally Accepted: grammar and spell checking; code debugging and syntax help; brainstorming and ideation; literature search and discovery; translation assistance. These uses are widely considered analogous to existing tools (spell checkers, search engines). Disclosure is good practice, but the ethical risk is low.
  • Grey Zone: drafting sections that are then heavily edited; generating initial analyses that are verified and refined; restructuring arguments; summarising literature; code generation for novel analyses. These uses involve more substantial AI contribution, and disclosure is essential. The key question: does the researcher maintain genuine intellectual command of the work?
  • Generally Problematic: submitting AI-generated text as one's own without disclosure; fabricating or augmenting data using AI; using AI to bypass learning objectives; allowing AI to make substantive research decisions without human oversight. These uses involve misrepresentation, fabrication, or abdication of intellectual responsibility. Most institutions and publishers would consider these misconduct.

💡 The Key Distinction

The critical question is not "did you use AI?" — it is "did you use AI honestly and transparently, maintaining your intellectual responsibility for the work?" A researcher who uses AI to draft text, carefully verifies and revises it, and discloses the process is acting with integrity. A researcher who submits AI output as their own work, unchecked and undisclosed, is not — regardless of how good the output is.

⚙️ Intellectual Property: Who Owns AI-Generated Content?

The legal landscape around AI-generated content and copyright is evolving rapidly and remains largely unsettled.

The Current Legal Landscape

  • US Copyright Office: Content generated by AI without human authorship is not copyrightable. However, human-directed AI use — where the human exercises substantial creative control — may produce copyrightable work
  • The boundary is unclear: How much human creative input is needed to make AI-assisted work copyrightable? Courts are still working this out
  • Other jurisdictions: The EU, UK, and other jurisdictions are developing their own approaches, which may differ significantly from the US position
  • Rapid evolution: Multiple court cases are currently in progress that could reshape the legal landscape significantly

Implications for Researchers

  • Copyright claims: If your AI-assisted work is not copyrightable, this could affect publication agreements, licensing, and IP ownership disputes
  • University IP policies: Many university IP policies were written before AI-generated content was a consideration — they may not clearly address who owns AI-assisted research outputs
  • Practical advice: Ensure sufficient human creative input and intellectual direction to maintain copyright claims. Document your creative process
  • Training data concerns: Separate from output copyright, there are ongoing legal disputes about whether AI companies had the right to use copyrighted material in training data

📄 Further Reading

Lund, B.D., et al. (2023): "ChatGPT and a New Academic Reality", JASIST. A comprehensive overview of the academic integrity and authorship debates around AI.

van Dis, E.A.M., et al. (2023): "ChatGPT: Five Priorities for Research", Nature, 614. Short, actionable priorities for the research community.

📚 Summary & Key Takeaways

This session covered the practical ethical landscape that every researcher using AI must navigate.

  • Disclosure is foundational: Transparency about AI use builds trust and enables reproducibility. Norms are converging toward more, not less, disclosure
  • Journal policies are converging: All major publishers agree — AI cannot be an author, and AI use must be disclosed. The specifics of how to disclose vary
  • The authorship consensus is clear (for now): AI cannot meet the accountability requirement for authorship. But the reasoning is worth examining because the underlying assumptions may need updating
  • Bias is systematic and subtle: AI-assisted research can contain invisible distortions — language bias, geographic bias, publication bias — that require active mitigation
  • Privacy requires institutional attention: Sensitive research data should not be entered into consumer AI tools without careful consideration of data handling policies and ethics approval
  • Integrity is a spectrum: The question is not whether you used AI, but whether you used it honestly, transparently, and with genuine intellectual responsibility
  • IP law is uncertain: The copyright status of AI-assisted work remains legally unsettled — document your creative process and ensure substantial human input

Next session: We put all of this into practice — working through four detailed case studies using the ethical lenses from Sub-Lesson 1 and the practical knowledge from this session, and beginning your personal ethical framework.